AITopics

2603.18483

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Cao, Yong-Syun, Imori, Shinpei, Ing, Ching-Kang

High-Dimensional Importance-Weighted Information Criteria: Theory and Optimality

arXiv.org Machine Learning May-13-2025

Various methods for high-dimensional model selection have been developed in recent years to address situations where the training and test data come from different distributions. When both input and output variables are available in the source (training) and target (test) domains but the target sample size is small, estimates based solely on the target data often suffer from high variance. To improve accuracy, auxiliary estimates from the source domain can be incorporated, along with bias correction to account for domain differences. This transfer learning strategy facilitates more reliable estimation under limited target information (see, for example, Li et al. (2021), Bastani (2021), and Tian and Feng (2022)). However, when test outputs (i.e., target responses) are unavailable, estimation or bias correction involving both domains becomes infeasible, as only inputs (covariates) are observed in the test set.
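As a rough illustration of the importance weighting named in the title, the sketch below fits a weighted least-squares line in which each source observation is weighted by the density ratio of target to source covariates. It is not the paper's criterion: the Gaussian covariate densities, their parameters, and the function names `importance_weights` and `weighted_least_squares` are all assumptions made for this example, and in practice the ratio would have to be estimated from the unlabeled test inputs.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def importance_weights(xs, source=(0.0, 1.0), target=(1.0, 1.0)):
    """Density ratio p_target(x) / p_source(x).

    Assumption: both covariate densities are known Gaussians (mean, std).
    """
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]

def weighted_least_squares(xs, ys, ws):
    """Fit y ~ a + b*x by minimising sum_i w_i * (y_i - a - b*x_i)^2."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Noise-free toy data from y = 1 + 2x, observed in the source domain.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x for x in xs]
ws = importance_weights(xs)
intercept, slope = weighted_least_squares(xs, ys, ws)
```

With a correctly specified model the weights do not change the fitted line. They matter when the model is misspecified, because the weighted fit then targets the test distribution rather than the training one.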

artificial intelligence, lemma 4, machine learning

2505.06531

Country: Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Machine Learning Mar-6-2025

Golden Ratio Weighting Prevents Model Collapse

He, Hengzhi, Xu, Shirong, Cheng, Guang

Recent studies identified an intriguing phenomenon in recursive generative model training known as model collapse, where models trained on data generated by previous models exhibit severe performance degradation. Addressing this issue and developing more effective training strategies have become central challenges in generative model research. In this paper, we investigate this phenomenon theoretically within a novel framework, where generative models are iteratively trained on a combination of newly collected real data and synthetic data from the previous training step. To develop an optimal training strategy for integrating real and synthetic data, we evaluate the performance of a weighted training scheme in various scenarios, including Gaussian distribution estimation and linear regression. We theoretically characterize the impact of the mixing proportion and weighting scheme of synthetic data on the final model's performance. Our key finding is that, across different settings, the optimal weighting scheme under different proportions of synthetic data asymptotically follows a unified expression, revealing a fundamental trade-off between leveraging synthetic data and generative model performance. Notably, in some cases, the optimal weight assigned to real data corresponds to the reciprocal of the golden ratio. Finally, we validate our theoretical results on extensive simulated datasets and a real tabular dataset.
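One way to see where a golden-ratio weight can come from is a toy Gaussian mean recursion. This is a sketch under assumptions of our own, not necessarily the paper's exact setting. Suppose each step forms $\theta_t = w\,\bar{x}_t + (1-w)\,\bar{s}_t$, where $\bar{x}_t$ is the mean of $n$ fresh real samples and $\bar{s}_t$ is the mean of $n$ synthetic samples drawn from $N(\theta_{t-1}, \sigma^2)$. The variance then satisfies $V_t = w^2\sigma^2/n + (1-w)^2(V_{t-1} + \sigma^2/n)$, with fixed point $V(w) = \frac{\sigma^2}{n}\cdot\frac{w^2 + (1-w)^2}{1-(1-w)^2}$. Setting the derivative to zero gives $w^2 + w - 1 = 0$, so $w = (\sqrt{5}-1)/2$, the reciprocal of the golden ratio. The code checks this numerically; the names `stationary_variance` and `best_weight` are hypothetical.

```python
import math

def stationary_variance(w, sigma2=1.0, n=100):
    """Fixed point of V = w^2*s/n + (1-w)^2 * (V + s/n), with s = sigma2.

    Assumption: equal numbers (n) of real and synthetic samples per step.
    """
    a = (1.0 - w) ** 2
    return (w ** 2 + a) * sigma2 / n / (1.0 - a)

def best_weight(grid_size=100_000):
    """Grid search for the real-data weight w in (0, 1] minimising the fixed point."""
    candidates = [(i + 1) / grid_size for i in range(grid_size)]
    return min(candidates, key=stationary_variance)

INV_GOLDEN_RATIO = (math.sqrt(5.0) - 1.0) / 2.0  # root of w^2 + w - 1 = 0 in (0, 1)
w_star = best_weight()
print(f"grid minimiser: {w_star:.5f}, 1/golden ratio: {INV_GOLDEN_RATIO:.5f}")
```

The minimiser does not depend on `sigma2` or `n` in this toy model, since both only scale the fixed point.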

err, estimation, synthetic data

2502.18049

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

arXiv.org Artificial Intelligence Feb-20-2025

Purest Quantum State Identification

Yu, Yingqi, Chen, Honglin, Wu, Jun, Xie, Wei, Li, Xiangyang

Precise identification of quantum states under noise constraints is essential for quantum information processing. In this study, we generalize the classical best arm identification problem to quantum domains, designing methods for identifying the purest one within $K$ unknown $n$-qubit quantum states using $N$ samples. We propose two distinct algorithms: (1) an algorithm employing incoherent measurements, achieving error $\exp\left(- \Omega\left(\frac{N H_1}{\log(K) 2^n }\right) \right)$, and (2) an algorithm utilizing coherent measurements, achieving error $\exp\left(- \Omega\left(\frac{N H_2}{\log(K) }\right) \right)$, highlighting the power of quantum memory. Furthermore, we establish a lower bound by proving that all strategies with fixed two-outcome incoherent POVM must suffer error probability exceeding $\exp\left( - O\left(\frac{NH_1}{2^n}\right)\right)$. This framework provides concrete design principles for overcoming sampling bottlenecks in quantum technologies.
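To make the notion of "purest" concrete: the purity of a density matrix $\rho$ is $\mathrm{Tr}(\rho^2)$, which equals 1 for a pure state and $1/2^n$ for the maximally mixed $n$-qubit state. The sketch below computes it for three single-qubit states and picks the largest. It assumes the density matrices are known exactly, which is precisely what the identification problem does not grant (there only $N$ measured samples are available), so it illustrates the target quantity rather than either proposed algorithm. The state names are made up for the example.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def purity(rho):
    """Tr(rho^2) for a single-qubit density matrix rho."""
    squared = matmul2(rho, rho)
    return (squared[0][0] + squared[1][1]).real

# Hypothetical candidate states (K = 3, n = 1).
states = {
    "pure_zero": [[1 + 0j, 0j], [0j, 0j]],
    "partly_mixed": [[0.75 + 0j, 0j], [0j, 0.25 + 0j]],
    "maximally_mixed": [[0.5 + 0j, 0j], [0j, 0.5 + 0j]],
}

purities = {name: purity(rho) for name, rho in states.items()}
purest = max(purities, key=purities.get)
```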

algorithm, error probability, quantum state

2502.14334

Country:

Asia > China > Anhui Province > Hefei (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Wu, Zhiyuan, Choi, Changkyu, Cao, Xiangcheng, Cevher, Volkan, Ramezani-Kebrya, Ali

Addressing Label Shift in Distributed Learning via Entropy Regularization

arXiv.org Artificial Intelligence Feb-4-2025

We address the challenge of minimizing true risk in multi-node distributed learning. These systems are frequently exposed to both inter-node and intra-node label shifts, which present a critical obstacle to effectively optimizing model performance while ensuring that data remains confined to each node. To tackle this, we propose the Versatile Robust Label Shift (VRLS) method, which enhances the maximum likelihood estimation of the test-to-train label density ratio. VRLS incorporates Shannon entropy-based regularization and adjusts the density ratio during training to better handle label shifts at test time. In multi-node learning environments, VRLS further extends its capabilities by learning and adapting density ratios across nodes, effectively mitigating label shifts and improving overall model performance. Experiments conducted on MNIST, Fashion MNIST, and CIFAR-10 demonstrate the effectiveness of VRLS, outperforming baselines by up to 20% in imbalanced settings. These results highlight the significant improvements VRLS offers in addressing label shifts. Our theoretical analysis further supports this by establishing high-probability bounds on estimation errors.
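For orientation, the baseline that this kind of method builds on can be sketched as the classical EM procedure for maximum likelihood estimation of the test-to-train label ratio from classifier outputs on unlabeled test inputs. The code below is that plain baseline only. It contains no entropy regularization and no multi-node logic, so it should not be read as VRLS itself; the function name `estimate_label_ratio` and the toy inputs are assumptions for illustration.

```python
def estimate_label_ratio(probs, train_prior, n_iter=100):
    """EM estimate of w[y] = p_test(y) / p_train(y).

    probs: classifier posteriors p_train(y | x_i) on unlabeled test inputs.
    train_prior: label frequencies p_train(y) in the training data.
    Assumption: the classifier is calibrated and every class keeps nonzero mass.
    """
    k = len(train_prior)
    q = list(train_prior)  # current guess of the test label distribution
    for _ in range(n_iter):
        totals = [0.0] * k
        for p in probs:
            # E-step: re-weight the posterior by the current prior ratio.
            unnorm = [p[y] * q[y] / train_prior[y] for y in range(k)]
            z = sum(unnorm)
            for y in range(k):
                totals[y] += unnorm[y] / z
        # M-step: the new test prior is the average adjusted posterior.
        q = [t / len(probs) for t in totals]
    return [q[y] / train_prior[y] for y in range(k)]

# Toy two-class example with soft predictions on four test points.
train_prior = [0.5, 0.5]
test_probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
ratio = estimate_label_ratio(test_probs, train_prior)
```

The estimated ratios can then be used to re-weight per-class training losses so that the training objective better matches the test label distribution.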

artificial intelligence, bayesian inference, machine learning

2502.02544

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report (0.81)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Artificial Intelligence Sep-23-2023

Tight bounds on Pauli channel learning without entanglement

Chen, Senrui, Oh, Changhun, Zhou, Sisi, Huang, Hsin-Yuan, Jiang, Liang

Entanglement is a useful resource for learning, but a precise characterization of its advantage can be challenging. In this work, we consider learning algorithms without entanglement to be those that only utilize separable states, measurements, and operations between the main system of interest and an ancillary system. These algorithms are equivalent to those that apply quantum circuits on the main system interleaved with mid-circuit measurements and classical feedforward. We prove a tight lower bound for learning Pauli channels without entanglement that closes a cubic gap between the best-known upper and lower bound. In particular, we show that $\Theta(2^n\varepsilon^{-2})$ rounds of measurements are required to estimate each eigenvalue of an $n$-qubit Pauli channel to $\varepsilon$ error with high probability when learning without entanglement. In contrast, a learning algorithm with entanglement only needs $\Theta(\varepsilon^{-2})$ rounds of measurements. The tight lower bound strengthens the foundation for an experimental demonstration of entanglement-enhanced advantages for characterizing Pauli noise.

classical-memory-assisted scheme, pauli channel, tr 2

2309.13461

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Hardware (0.88)

arXiv.org Artificial Intelligence Apr-11-2023

Analytic theory for the dynamics of wide quantum neural networks

Liu, Junyu, Najafi, Khadijeh, Sharma, Kunal, Tacchino, Francesco, Jiang, Liang, Mezzacapo, Antonio

Parameterized quantum circuits can be used as quantum neural networks and have the potential to outperform their classical counterparts when trained for addressing learning problems. To date, many of the results on their performance on practical problems are heuristic in nature. In particular, the convergence rate for the training of quantum neural networks is not fully understood. Here, we analyze the dynamics of gradient descent for the training error of a class of variational quantum machine learning models. We define wide quantum neural networks as parameterized quantum circuits in the limit of a large number of qubits and variational parameters. We then find a simple analytic formula that captures the average behavior of their loss function and discuss the consequences of our findings. For example, for random quantum circuits, we predict and characterize an exponential decay of the residual training error as a function of the parameters of the system. We finally validate our analytic results with numerical experiments.

artificial intelligence, machine learning, tr 2

doi: 10.1103/PhysRevLett.130.150601

2203.16711

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Asia > British Indian Ocean Territory > Diego Garcia (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Liu, Xinghan, Lorini, Emiliano, Rotolo, Antonino, Sartor, Giovanni

Modelling and Explaining Legal Case-based Reasoners through Classifiers

arXiv.org Artificial Intelligence Dec-8-2022

This paper brings together two lines of research: factor-based models of case-based reasoning (CBR) and the logical specification of classifiers. Logical approaches to classifiers capture the connection between features and outcomes in classifier systems. Factor-based reasoning is a popular approach to reasoning by precedent in AI & Law. Horty (2011) has developed the factor-based models of precedent into a theory of precedential constraint. In this paper we combine the modal logic approach (binary-input classifier, BCL) to classifiers and their explanations given by Liu & Lorini (2021) with Horty's account of factor-based CBR, since both a classifier and CBR map sets of features to decisions or classifications. We reformulate the case bases of Horty in the language of BCL, and give several representation results. Furthermore, we show how notions of CBR, e.g. reason and preference between reasons, can be analyzed by notions of classifier systems.

artificial intelligence, atm 0, machine learning

2210.11217

Country:

Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (0.40)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Utpala, Saiteja, Sriperumbudur, Bharath K.

Shrinkage Estimation of Higher Order Bochner Integrals

arXiv.org Machine Learning Jul-21-2022

We consider shrinkage estimation of higher order Hilbert space valued Bochner integrals in a non-parametric setting. We propose estimators that shrink the $U$-statistic estimator of the Bochner integral towards a pre-specified target element in the Hilbert space. Depending on the degeneracy of the kernel of the $U$-statistic, we construct consistent shrinkage estimators with fast rates of convergence, and develop oracle inequalities comparing the risks of the $U$-statistic estimator and its shrinkage version. Surprisingly, we show that the shrinkage estimator designed by assuming complete degeneracy of the kernel of the $U$-statistic is a consistent estimator even when the kernel is not completely degenerate. This work subsumes and improves upon Krikamol et al., 2016, JMLR and Zhou et al., 2019, JMVA, which only handle mean element and covariance operator estimation in a reproducing kernel Hilbert space. We also specialize our results to normal mean estimation and show that for $d\ge 3$, the proposed estimator strictly improves upon the sample mean in terms of the mean squared error.
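The normal mean case in the last sentence has a well-known classical counterpart, the James-Stein estimator, which shrinks an observation towards a fixed target and beats the unshrunk estimate in mean squared error once $d \ge 3$. The simulation below uses the positive-part James-Stein rule as a stand-in. It is not the estimator proposed in the paper, and the dimension, true mean, target, seed, and trial count are arbitrary choices for illustration.

```python
import random

def james_stein(x, target, sigma2=1.0):
    """Positive-part James-Stein estimate shrinking x towards target.

    Assumption: x ~ N(theta, sigma2 * I) with known sigma2 and len(x) >= 3.
    """
    d = len(x)
    dist2 = sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    shrink = max(0.0, 1.0 - (d - 2) * sigma2 / dist2)
    return [ti + shrink * (xi - ti) for xi, ti in zip(x, target)]

def squared_error(estimate, theta):
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))

def simulate(d=10, trials=2000, seed=0):
    """Average squared error of the raw observation and of its shrunk version."""
    rng = random.Random(seed)
    theta = [0.5] * d   # true mean (hypothetical)
    target = [0.0] * d  # pre-specified shrinkage target
    raw_total = js_total = 0.0
    for _ in range(trials):
        x = [rng.gauss(t, 1.0) for t in theta]
        raw_total += squared_error(x, theta)
        js_total += squared_error(james_stein(x, target), theta)
    return raw_total / trials, js_total / trials

mse_raw, mse_js = simulate()
print(f"raw MSE: {mse_raw:.2f}, James-Stein MSE: {mse_js:.2f}")
```

The gain is largest when the true mean lies close to the target and fades as the two move apart, which is the usual trade-off behind choosing a shrinkage target.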

artificial intelligence, estimator, machine learning